Main
Alexandre Henrique S. Dias
I’m a full-time Data Scientist and an MSc student in Electrical and Computer Engineering at UFRN. My research focuses on social network analysis, graph theory, and Natural Language Processing. I mainly program in Python, R, C++, and SQL, and my favorite ML frameworks are Scikit-Learn and TensorFlow. I also have MLOps experience with GKE, Kubeflow, Kubernetes, and Docker.
Industry Experience
Data Scientist
Americanas S.A.
São Paulo, SP
Present - 2021
- Build ML models using Python, Scikit-Learn, and TensorFlow, applying ML to a wide range of topics such as Complex Network Analysis, Social Networks, NLP, and HR Analytics.
- Create ML pipelines using Kubeflow Pipelines on Google Cloud AI Platform, and participate in the design of CI/CD operations for ML models.
Data Scientist
Looqbox
São Paulo, SP
2021 - 2019
- Developed BI reports and dashboards using R, Python, and SQL.
- Maintained the Looqbox R package, used to build R objects and data structures compatible with the Looqbox Application.
Education
M. Sc., Electrical and Computer Engineering
UFRN - Federal University of Rio Grande do Norte
Natal, RN
Present - 2021
- Research in complex network analysis, social networks, graph theory, and NLP.
- Tools: Python, NetworkX, Gephi, TensorFlow, WandB, Git.
MITx MicroMasters Program in Statistics and Data Science
MITx on edX
edX
2022 - 2020
- The MITx MicroMasters Program in Statistics and Data Science covers the fundamentals of data science, statistics, and machine learning.
B. Sc., Computer Engineering
UFRN - Federal University of Rio Grande do Norte
Natal, RN
2019 - 2018
- Researcher and member of the Modeling and Scientific Data Analysis team.
B. Sc., Sciences & Technology
UFRN - Federal University of Rio Grande do Norte
Natal, RN
2017 - 2015
- Linear Algebra and Analytical Geometry Teaching Assistant.
- Calculus II Teaching Assistant.
Certificates & Courses
MicroMasters in Statistics and Data Science
MITx on edX
2022 - 2020
- 6.431x: Probability - The Science of Uncertainty and Data.
- 18.6501x: Fundamentals of Statistics.
- 6.86x: Machine Learning with Python - From Linear Models to Deep Learning.
- 14.310x/Fx: Data Analysis in Social Science.
- DS.CFx: Capstone Exam for Statistics and Data Science.
MLOps (Machine Learning Operations) Fundamentals
Coursera
2021
DataCamp completed tracks
DataCamp
2019 - 2018
- Data Scientist with Python.
- Data Analyst with Python.
- Data Manipulation with Python.
- Machine Learning with Python.
- Importing & Cleaning Data with Python.
- Python Programming.
- Python Programmer.
Research Experience
Undergraduate Researcher
Digital Metropolis Institute
UFRN
2019 - 2018
- Developed a traffic monitoring system using image recognition techniques.
Undergraduate Researcher
Department of Informatics and Applied Mathematics
UFRN
2017 - 2016
- Developed an interactive theorem prover based on Linear Logic using the Maude programming language.
Academic Publications
Paper published in the 2019 II Workshop on Metrology for Industry 4.0 and IoT (MetroInd4.0&IoT). Naples, Italy.
Performance Evaluation of an Edge OBD-II Device for Industry 4.0
Institute of Electrical and Electronics Engineers
IEEE
2019
- Performance evaluation of an Edge OBD-II device that autonomously collects data from vehicles to provide customer feedback and tracking.
Selected Data Science Writing
I enjoy reading about productivity, lifestyle, data science/AI, and statistics.
Dimensionality Reduction with Factor Analysis on Student Performance Data
2021
- A dimensionality reduction technique with interpretable outputs.
Stop Using the Elbow Method
2021
- Silhouette Analysis: A more precise approach to finding the optimal number of clusters using K-Means.
Scikit-Learn 1.0 - A true milestone
2021
- An overview of the design principles of Scikit-Learn and how the famous ML library became so popular.
The Expectation-Maximization (EM) Algorithm
2021
- Understanding the motivation behind the EM Algorithm and how it works.
A mathematical derivation of the Law of Total Variance
2020
- Understanding what the Law of Total Variance is and when to apply it.
Clustering with K-means: simple yet powerful
2019
- Explains what Cluster Analysis is and how the K-means algorithm works, along with its pros and cons.
An introduction to Linear Regression
2019
- Explains the assumptions behind Linear Regression, how to measure its performance, and how to implement it in Python.